Objects in R

2023 Bio R Workshop

Author

Prof. Rey R. Cuenca
Math-Stat Dept., MSU-IIT

An Intuitive Framework

One way to gain a quick, if partial, understanding of a complex system of ideas is to form a simplified mental picture of it. The same approach helps when learning R, since its learning curve is quite steep:

“The learning curve for R programming is steep due to its unique syntax and extensive set of commands, requiring most new learners to spend four to six weeks mastering it.” - Noble Desktop (NYC’s Top Design & Coding School Since 1990)

An (over)simplified mental picture for R beginners is to think of working in R as cooking. Cooking essentially requires three things:

  1. Ingredients – R objects, a.k.a. “data containers”
  2. Cooking utensils/equipment – R functions
  3. Recipe – R scripts or Markdown files

Figure 1: Mental picture when working with R

You can think of RStudio’s Console and Source Panes as the cooking table of the “chef” (you).

Vectors

Probably the most fundamental object that acts as a “data container” (i.e. data structure) in R is the vector (also called an atomic vector). Almost all other objects that a typical R user works with are built from vectors. Every vector has three properties:

  1. Type - typeof(), what it is
  2. Length - length(), how many elements it contains
  3. Attributes - attributes(), additional arbitrary metadata
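
As a minimal sketch of these three properties, the snippet below inspects a small numeric vector. The variable name `heights`, its values, and the element names are made up for illustration; only base R functions are used.

```r
# A hypothetical vector of three measurements (values invented for illustration)
heights <- c(150.5, 162.0, 171.3)

typeof(heights)      # "double": the type of data stored
length(heights)      # 3: the number of elements
attributes(heights)  # NULL: a plain vector carries no extra metadata

# Naming the elements adds a "names" attribute
names(heights) <- c("Ana", "Ben", "Cara")
attributes(heights)  # now a list with one component, $names
```

Type and length are always present, while attributes are optional extras attached to the same data.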

Vectors can be created in many ways. The two most basic ways depend on the length of the vector:

  1. Length = 1. Type a single value (a number or a quoted string) in the Console Pane and run it.
  2. Length > 1. Use the R combine function c().
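
The two cases above can be checked with a short sketch. The variable names `single` and `combined` are hypothetical and used only for illustration.

```r
# A single value is already a vector of length 1
single <- 42
is.vector(single)          # TRUE
length(single)             # 1
identical(single, c(42))   # TRUE: c() with one argument returns the same vector

# c() combines values and also flattens vectors passed as arguments
combined <- c(1, 2, c(3, 4))
length(combined)           # 4
```

In other words, R has no separate "scalar" container: a lone value is simply the shortest kind of vector.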

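Characters or Strings

"a"               # one string
"a,    b,c"       # still one string: commas inside the quotes are just characters
c("a","b","c")    # three strings combined into one vector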
typeof("a")
typeof("a,    b,c")
typeof(c("a","b","c"))
length("a")
length("a,    b,c")
length(c("a","b","c"))

Numbers

15L
1.0
1 + 2i

c(1L,2L,0L,-15L)
c(1.0,1,4,6,-56,1e-10,1e4)
c(1 + 2i,1,0 - 3i, 3i)
typeof(15L)
typeof(1.0)
typeof(1 + 2i)

typeof(c(1L,2L,0L,-15L))
typeof(c(1.0,1,4,6,-56,1e-10,1e4))
typeof(c(1 + 2i,1,0 - 3i, 3i))
length(15L)
length(1.0)
length(1 + 2i)

length(c(1L,2L,0L,-15L))
length(c(1.0,1,4,6,-56,1e-10,1e4))
length(c(1 + 2i,1,0 - 3i, 3i))
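
The `L` and `i` suffixes in the examples above decide which numeric type R stores. As a small sketch using base R only (the variable names are hypothetical), combining different numeric types in one `c()` call yields the most general type involved:

```r
int_value <- 1L               # the L suffix requests an integer
dbl_value <- 1                # a number without a suffix is a double
mixed_dbl <- c(1L, 2.5)       # integer combined with double
mixed_cpl <- c(1L, 2.5, 3i)   # anything combined with complex

typeof(int_value)    # "integer"
typeof(dbl_value)    # "double"
typeof(mixed_dbl)    # "double"
typeof(mixed_cpl)    # "complex"

is.numeric(int_value)   # TRUE
is.numeric(1 + 2i)      # FALSE: is.numeric() covers integer and double only
```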

Logical or Boolean

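T        # shorthand for TRUE
F        # shorthand for FALSE
TRUE     # the full names are reserved words and cannot be reassigned
FALSE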
c(T,FALSE)
c(T,T,T,T,F,FALSE,F,TRUE,T,FALSE,T)
typeof(c(T,FALSE))
length(c(T,FALSE))
attributes(c(T,FALSE))

Matrix

# Number of entries matches number of elements
matrix(c(1,2,3,4,5,6,7,8), nrow = 2, ncol = 4)
matrix(c(1,2,3,4,5,6,7,8), nrow = 2, ncol = 4, byrow = TRUE)

# Number of entries does not match the number of elements
# Resolved by recycling elements (R may warn when the counts do not divide evenly)
matrix(c(1,2,3,4,5,6,7,8), nrow = 2, ncol = 10)
matrix(c(1,2,3,4,5,6,7,8), nrow = 2, ncol = 13, byrow = TRUE)
## Example of setting row and column names
matrix(data = c(1,2,3, 11,12,13), 
       nrow = 2, 
       ncol = 3, 
       byrow = TRUE,
       dimnames = list(c("row1", "row2"),
                       c("C.1", "C.2", "C.3")))
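
# Bind vectors as columns (cbind) or as rows (rbind)
cbind(c(1,2,3,4), c(5,6,7,8))
rbind(c(1,2,3,4), c(5,6,7,8))

# Mixing numbers and strings: every entry is coerced to character
cbind(c(1,2,3,4),
      c(5,6,7,8), 
      c("A","B","C","D"))

# Mixing numbers and logicals: TRUE/FALSE are coerced to 1/0
rbind(c(1,2,3,4),
      c(5,6,7,8),
      c(T,F,T,T))

# Nesting: a 1 x 2 row stacked on top of a 4 x 2 matrix gives a 5 x 2 matrix
rbind(c(143,243),
      cbind(c(5,6,7,8), 
            c(T,F,T,T)))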
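
The recycling and mixed-type behaviour in the matrix examples above can be verified with a short sketch. The object names `m`, `chr` and `num` are hypothetical; the sizes are chosen so that the data divide the matrix evenly and no warning is expected.

```r
# Recycling: 8 values fill a 2 x 8 matrix (16 cells) by being used twice
m <- matrix(c(1,2,3,4,5,6,7,8), nrow = 2, ncol = 8)
dim(m)                          # 2 8
identical(m[, 1:4], m[, 5:8])   # TRUE: the second half repeats the first

# A matrix holds a single type, so numbers mixed with strings become strings
chr <- cbind(c(1, 2), c("A", "B"))
typeof(chr)                     # "character"

# Logical values bound with numbers become numbers (TRUE -> 1, FALSE -> 0)
num <- rbind(c(1, 2), c(TRUE, FALSE))
typeof(num)                     # "double"
num[2, ]                        # 1 0
```

Because a matrix is a vector with dimensions, it inherits the one-type-only rule of vectors.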

Data Frame

data.frame(
  ID = c(1103,1483,5670),
  Name = c("Mark","John","Maria"),
  Age = c(15L,13L,16L),
  BType = c("A","O","B"),
  WVaccine = c(T,T,F)
)
    ID  Name Age BType WVaccine
1 1103  Mark  15     A     TRUE
2 1483  John  13     O     TRUE
3 5670 Maria  16     B    FALSE

# The same data as a tibble (requires the dplyr package to be installed)
dplyr::tibble(
  ID = c(1103,1483,5670),
  Name = c("Mark","John","Maria"),
  Age = c(15L,13L,16L),
  BType = c("A","O","B"),
  WVaccine = c(T,T,F)
)
# A tibble: 3 × 5
     ID Name    Age BType WVaccine
  <dbl> <chr> <int> <chr> <lgl>   
1  1103 Mark     15 A     TRUE    
2  1483 John     13 O     TRUE    
3  5670 Maria    16 B     FALSE   
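
A data frame can be read as a list of equal-length vectors, one per column, which is why each column keeps its own type. The sketch below uses base R only; the object name `df` is hypothetical, and `stringsAsFactors = FALSE` is set explicitly so that strings stay as character columns on older R versions as well.

```r
# A small data frame built from three vectors of length 3
df <- data.frame(
  Name = c("Mark", "John", "Maria"),
  Age = c(15L, 13L, 16L),
  WVaccine = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

typeof(df)           # "list": the columns are stored as list elements
class(df)            # "data.frame"
sapply(df, typeof)   # "character" "integer" "logical", named by column
df$Age               # a single column is an ordinary vector: 15 13 16
nrow(df)             # 3
ncol(df)             # 3
```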

Lists

A list is a vector on “steroids”. While an atomic vector allows only a single type of data (logical, numeric, etc.), a list allows a mixture of different types. In other words, an atomic vector is a homogeneous container, while a list is a heterogeneous one.

c(1,2,3)
list(1,2,3)
c(1,"A",TRUE,c(5.4,-4.0))
list(1,"A",TRUE,c(5.4,-4.0))
list(Name1 = 1, Name2 = "A", Name3 = TRUE, Name4 = c(5.4,-4.0))
list(Name1 = 1,
     Name2 = "A",
     Name3 = TRUE,
     Name4 = c(5.4,-4.0))
list("Name 1" = 1,
     "Name 2" = "A",
     "Name 3" = TRUE,
     "Name 4" = c(5.4,-4.0))
list(`Name 1` = 1,
     `Name 2` = "A",
     `Name 3` = TRUE,
     `Name 4` = c(5.4,-4.0))
list(
  `A vector` = 1:10,
  `A matrix` = matrix(1:9, nrow = 3),
  `Another list` = list(Name1 = 1, 
                        Name2 = "A", 
                        Name3 = TRUE, 
                        Name4 = c(5.4,-4.0))
)
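
The difference between the homogeneous and heterogeneous containers described above can be checked directly. This is a minimal base R sketch; the object names `as_vector`, `as_list` and `person` are hypothetical.

```r
# c() coerces everything to one common type; list() keeps each type
as_vector <- c(1, "A", TRUE)
as_list <- list(1, "A", TRUE)

typeof(as_vector)          # "character"
as_vector                  # "1" "A" "TRUE"
sapply(as_list, typeof)    # "double" "character" "logical"

# Named elements can be extracted with $ or [[ ]]
person <- list(Name1 = 1, `Name 2` = "A")
person$Name1               # 1
person[["Name 2"]]         # "A"
length(person)             # 2
```

Names containing spaces need backticks or quotes when the list is created, and `[[ ]]` with a quoted name when an element is extracted.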

Variables and Constants

The same as